Abstract

This paper provides a framework for analyzing white noise disturbances in linear systems.
Rather than the usual stochastic approach, noise signals are described as elements
in  sets  and  the  disturbance  rejection  properties  of  the  system  are  considered  in  a  worst 
case  setting.  The  description  is  based  on  constraints  in  signal  space,  directly  verifiable 
on  experimental  data.  These  constraints  can  be  given  a  representation  compatible  with 
standard  robust  control,  allowing  the  formulation  of  white  noise  rejection  problems  in  the 
presence  of  other  sources  of  uncertainty.  It  is  also  shown  how  the  framework  can  capture 
as  a  special  case  the  usual  stochastic  approach,  with  equivalent  results. 


1  Introduction 

Inaccuracies in mathematical models of physical systems are often characterized by the introduction
of external disturbances, which account for phenomena that are too complex or
unpredictable to be conveniently captured by the model. A model must then be accompanied
by a description of the disturbance, and this implies a basic choice in the mathematical
framework. The deterministic approach is to specify a set of allowable disturbances, and leads
to worst case analysis over this set. Alternatively, the stochastic paradigm specifies a measure
(probability distribution) on the disturbances, and leads naturally to analysis in the average.
The two relevant factors underlying this choice are mathematical convenience and the
objective  of  obtaining  a  realistic  and  tight  characterization  of  the  empirical  disturbance. 

Probabilistic  models  have  been  the  typical  choice  for  white  noise  disturbances,  which 
appear, for example, when considering the cumulative macroscopic effect of very high dimensional
fluctuations at the microscopic level. Indeed, a stochastic process seems to provide
a  very  good  approximation  to  these  phenomena,  and  in  many  cases  this  leads  to  tractable 
mathematics.  An  example  of  this  in  control  theory  is  the  classical  H2  (LQG)  disturbance 
rejection  problem,  namely  the  design  of  a  feedback  system  which  minimizes  the  (average) 
sensitivity  of  a  linear  system  to  stochastic  noise. 

The  stochastic  paradigm  is  less  attractive,  however,  in  practical  control  problems  where 
disturbances  coexist  with  other  forms  of  uncertainty,  such  as  unmodeled  dynamics,  which  are 
described more naturally in a deterministic setting. In this robust control setting, mathematical
convenience calls for deterministic disturbance rejection problems (e.g. $\mathcal{H}_\infty$, $\mathcal{L}_1$) where the
disturbance is allowed to vary in a given set (e.g., the unit ball of $l_2$, $l_\infty$). Even though this
leads to a successful robustness analysis methodology, some conservatism is involved in these
deterministic classes, since the signals which give worst case performance (e.g. persisting
sinusoids, in the $\mathcal{H}_\infty$ case) are often very unlikely disturbances in practical situations.

Attempts  to  combine  deterministic  uncertainty  with  stochastic  white  noise  (the  “Robust 
H2”  problem,  see  for  example  [17])  face  the  difficulty  of  analyzing  simultaneously  the  worst 
case  effect  of  the  uncertainty  and  the  average  effect  of  the  disturbance,  and  have  only  resulted 
in upper bounds. Similar difficulties arise when attempting to establish closer connections
between classical system identification and robust control, since the former relies entirely on
the stochastic paradigm for disturbances.

An  important  remark  is  that  there  is  nothing  inherently  stochastic  about  white  noise:  it  is 
known  that  deterministic  chaotic  systems  can  produce  spectral  effects  indistinguishable  from 
random  noise. 

This  discussion  leads  us  to  the  main  objective  of  this  paper,  which  is  to  obtain  tight 
deterministic  descriptions  for  white  noise,  suitable  for  robust  control  purposes.  We  now 
specify  in  more  detail  the  objectives  these  set  descriptions  should  meet: 

1.  They  must  allow  a  finite  time  horizon  formulation;  this  is  essential  if  these  descriptions 
are  to  be  used  in  practical  problems  involving  data,  such  as  system  identification. 

2.  They  must  be  rich  enough  to  include  “typical”  instances  of  stochastic  white  noise  signals. 

3.  They  must  be  tight  enough,  so  that  worst  case  rejection  properties  of  a  system  under 
disturbances  of  the  set  are  essentially  the  same  as  average  rejection  properties  under 
stochastic  white  noise. 

4. They must allow for a mathematical formulation similar to other deterministic descriptions
of uncertainty, to permit a simple formulation of robust performance problems.

The  deterministic  approach  to  statistical  spectral  analysis  is  not  new  and  goes  as  far 
back  as  Wiener  [19];  a  modern  reference  is  [10].  However,  these  treatments  rely  entirely  on 
asymptotic  properties  of  signals  defined  on  infinite  time  intervals,  and  are  not  focused  on  the 
rejection problem and the related robustness issues.

In what follows we will present a very natural formulation compatible with the previous
requirements;  the  starting  point  is  the  following  question:  how  does  one  decide  whether  a 
signal  can  be  accurately  modeled  as  a  stochastic  white  noise  trajectory?  Deciding  this  from 
experimental  data  leads  to  a  statistical  hypothesis  test  on  a  finite  length  signal.  In  other 
words,  one  will  accept  a  signal  as  white  noise  if  it  belongs  to  a  certain  set.  The  main  idea  of 
our  formulation  is  to  take  this  set  as  the  definition  of  white  noise,  and  carry  out  the  subsequent 
analysis  in  a  deterministic  setting. 

The  paper  is  organized  as  follows:  Section  2  contains  the  notation  and  some  preliminary 
facts;  in  Section  3,  time  domain  descriptions  are  given,  and  they  are  analyzed  from  the 
point  of  view  of  the  competing  requirements  (2)  and  (3).  In  Section  4,  the  same  is  done 
with  descriptions  in  the  frequency  domain.  Section  5  provides  the  multivariable  extension 
of  the  previous  framework.  Section  6  summarizes  the  work  and  outlines  a  resulting  research 
direction,  which  allows  for  these  descriptions  to  be  cast  in  the  mathematical  framework  of 
robust  control.  Some  technical  proofs  are  covered  in  the  Appendix. 

2  Notation  and  Preliminaries 

This  paper  deals  with  discrete  time  signals  and  linear  time  invariant  systems.  In  sections  2, 
3,  and  4  we  will  consider  scalar  signals  and  single  input/single  output  (SISO)  systems.  The 
multivariable  version  is  given  in  Section  5.  Notation  and  elementary  properties  of  spectral 
analysis are presented in this section, in both finite and infinite time horizon cases.

2.1 Finite horizon properties


In  the  finite  horizon  case,  we  wish  to  characterize  white  signals  among  sequences  of  length 
N,  and  the  steady  state  response  of  a  linear  system  to  such  a  disturbance.  Therefore,  to 
get  sensible  answers  we  must  assume  that  N  is  much  larger  than  the  system  time  constants. 
Under  this  assumption,  the  system  gain  will  not  be  substantially  affected  if  we  consider  the 
signals  to  be  periodic,  with  period  N,  and  we  have  available  the  information  of  one  period. 
In  fact,  the  system  will  not  be  sensitive  to  these  “long  range”  correlations  we  have  introduced 
in  the  input  signals.  As  a  counterpart,  this  gives  tractable  expressions. 

Let $x(t)$ be a periodic, real valued signal, of period $N$, which will often be identified
with the finite sequence $x(0), \ldots, x(N-1)$. The discrete Fourier transform (DFT) $X(k)$, $k = 0, \ldots, N-1$,
of the sequence $x(t)$ is defined by the relations

$$X(k) = \sum_{t=0}^{N-1} x(t)\, e^{-j\frac{2\pi}{N}kt}; \qquad x(t) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\, e^{j\frac{2\pi}{N}kt} \qquad (1)$$

As a map between $\mathbb{C}^N$ and $\mathbb{C}^N$, the DFT as defined is a unitary operator scaled by $\sqrt{N}$.

The (circular) autocorrelation sequence of $x$ (“correlogram”) is given by

$$r_x(\tau) = \sum_{t=0}^{N-1} x(t+\tau)\,x(t), \qquad \tau = 0, \ldots, N-1 \qquad (2)$$

and the sequence power spectrum (“periodogram”) by $s_x(k) = |X(k)|^2$, $k = 0, \ldots, N-1$.

It is easy to show that the sequences $r_x(\tau)$ and $s_x(k)$ form a DFT pair. For an $N$-periodic
signal $x(t)$, we will use as norm the energy over the period, $\|x\|^2 = r_x(0) = \frac{1}{N}\sum_{k=0}^{N-1} s_x(k)$.
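
The DFT pair and the energy identity above can be checked numerically. The following is a minimal sketch in plain Python, assuming the sign convention of (1); the function names `dft`, `correlogram` and `periodogram` are illustrative and not part of the paper.

```python
import cmath
import random


def dft(x):
    """Unnormalized DFT of (1): X(k) = sum_t x(t) exp(-j 2 pi k t / N)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def correlogram(x):
    """Circular autocorrelation of (2); time indices are taken modulo N."""
    n = len(x)
    return [sum(x[(t + tau) % n] * x[t] for t in range(n)) for tau in range(n)]


def periodogram(x):
    """Sequence power spectrum s_x(k) = |X(k)|^2."""
    return [abs(value) ** 2 for value in dft(x)]


random.seed(0)  # arbitrary seed, only to make the run repeatable
x = [random.gauss(0.0, 1.0) for _ in range(32)]

r = correlogram(x)
s = periodogram(x)

# DFT pair: transforming the correlogram should reproduce the periodogram.
pair_error = max(abs(rk - sk) for rk, sk in zip(dft(r), s))

# Energy over one period, computed in time and in frequency.
energy_time = r[0]
energy_freq = sum(s) / len(x)

print(pair_error, energy_time, energy_freq)
```

Both `pair_error` and the difference between the two energies should be at the level of floating point rounding.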

We consider a stable, LTI, discrete time SISO system $\mathcal{H}(\lambda) = \sum_{t=-\infty}^{\infty} h(t)\lambda^t$, where $\lambda$ is
the shift operator. The frequency response (Fourier transform of $h(t)$) is denoted by $\mathcal{H}(e^{j\omega})$.
For a fixed $N$, denote $H(k) = \mathcal{H}(e^{j\frac{2\pi}{N}k})$ (if $\mathcal{H}$ is FIR of length less than $N$, this corresponds
to the DFT of $h(t)$). The autocorrelations of the system $\mathcal{H}$ are defined by

$$r_h(\tau) = \sum_{t=-\infty}^{\infty} h(t+\tau)\,h(t) \qquad (3)$$

assuming convergence for every $\tau$ (this is true if, for example, $\mathcal{H}$ is causal, exponentially
stable; this will be assumed in the rest of the paper). The Fourier transform of $r_h(\tau)$ is the
power spectrum $s_h(e^{j\omega}) = |\mathcal{H}(e^{j\omega})|^2$. Also, the 2-norm of the system is given by

$$\|\mathcal{H}\|_2^2 = r_h(0) = \sum_{t=-\infty}^{\infty} h(t)^2 = \frac{1}{2\pi}\int_0^{2\pi} s_h(e^{j\omega})\,d\omega \qquad (4)$$

As  an  immediate  consequence  of  the  previous  definitions,  the  following  relationships  hold. 

Lemma 1 Let $\mathcal{H}$ be a SISO discrete time stable system. Let $u(t)$ be an $N$-periodic input
signal to system $\mathcal{H}$, $y(t)$ be the corresponding steady state (periodic) output. Then

(i) $$r_y(\tau) = \sum_{t=-\infty}^{\infty} r_h(t)\, r_u(t-\tau) \qquad (5)$$

(ii) $$s_y(k) = |H(k)|^2\, s_u(k) \qquad (6)$$
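
Both identities of Lemma 1 can be verified on a small example. The sketch below assumes a causal FIR filter, so that the steady state response to a periodic input reduces to a circular convolution; the filter coefficients and helper names are hypothetical and chosen only for illustration.

```python
import cmath
import random


def dft(x):
    """Unnormalized DFT, X(k) = sum_t x(t) exp(-j 2 pi k t / N)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def correlogram(x):
    """Circular autocorrelation r_x(tau), tau = 0..N-1."""
    n = len(x)
    return [sum(x[(t + tau) % n] * x[t] for t in range(n)) for tau in range(n)]


def steady_state_output(h, u):
    """Periodic steady state response of the causal FIR filter h to the
    N-periodic extension of u (a circular convolution)."""
    n = len(u)
    return [sum(h[i] * u[(t - i) % n] for i in range(len(h))) for t in range(n)]


def filter_autocorrelation(h, tau):
    """r_h(tau) of (3) for a causal FIR impulse response h."""
    tau = abs(tau)  # r_h is an even sequence
    return sum(h[t + tau] * h[t] for t in range(len(h) - tau))


random.seed(1)
n = 24
h = [1.0, -0.6, 0.3]  # hypothetical FIR impulse response
u = [random.gauss(0.0, 1.0) for _ in range(n)]
y = steady_state_output(h, u)

# (6): s_y(k) = |H(k)|^2 s_u(k), with H(k) the DFT of h zero-padded to length N.
H = dft(h + [0.0] * (n - len(h)))
s_u = [abs(v) ** 2 for v in dft(u)]
s_y = [abs(v) ** 2 for v in dft(y)]
spectrum_error = max(abs(s_y[k] - abs(H[k]) ** 2 * s_u[k]) for k in range(n))

# (5): r_y(tau) = sum_t r_h(t) r_u(t - tau); only |t| < len(h) contributes.
r_u = correlogram(u)
r_y = correlogram(y)
span = len(h) - 1
correlation_error = max(
    abs(r_y[tau] - sum(filter_autocorrelation(h, t) * r_u[(t - tau) % n]
                       for t in range(-span, span + 1)))
    for tau in range(n)
)

print(spectrum_error, correlation_error)
```

Only $|H(k)|^2$ enters (6), so the sign convention chosen for the frequency response does not affect the check for a real impulse response.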

2.2  Infinite  horizon  properties 

Autocorrelations and spectra can be defined in different classes of infinite horizon signals. We
will mention here bounded power signals and bounded energy ($l_2$) signals. In the bounded
power case, consider the class

$$BV = \left\{ x(t) : r_x(\tau) = \lim_{N\to\infty} \frac{1}{2N+1}\sum_{t=-N}^{N} x(t+\tau)\,x(t) \ \text{exists for each } \tau \right\} \qquad (7)$$

For $l_2$ (square-summable) sequences, define $r_x(\tau) = \langle x, \lambda^\tau x \rangle$ as in (3). In both cases, it can
be shown that there exists a spectral density $s_x(e^{j\omega})$ such that

$$r_x(\tau) = \frac{1}{2\pi}\int_0^{2\pi} s_x(e^{j\omega})\, e^{j\omega\tau}\, d\omega \qquad (8)$$

For the bounded power case, this follows from Bochner’s theorem (see [5]), and $s_x(e^{j\omega})\,d\omega$ is
in general a nonnegative measure. In the $l_2$ case, $s_x(e^{j\omega}) = |X(e^{j\omega})|^2$, where $X(e^{j\omega})$ is the
Fourier transform of $x(t)$. $l_2$ is equipped with the standard norm $\|x\|_2 = r_x(0)^{1/2}$, $BV$ with the
seminorm $\|x\|_p = r_x(0)^{1/2}$.

Under mild assumptions on the system $\mathcal{H}$, (5) carries through to infinite horizon. Also, the
corresponding extension of (6) is $s_y(e^{j\omega}) = |\mathcal{H}(e^{j\omega})|^2\, s_u(e^{j\omega})$.

3  Time  domain  descriptions 

3.1  Finite  horizon  descriptions 

The  starting  point  for  a  deterministic  white  noise  theory  is  to  characterize  white  signals  among 
all  sequences  of  length  N;  when  faced  with  the  problem  of  deciding  whether  an  empirical 
signal  is  a  sample  of  white  noise,  a  statistician  will  perform  a  hypothesis  test  in  terms  of 
some  statistic.  A  common  choice  is  the  sample  autocorrelation,  which  should  approximate 
the expected correlation for white noise (a delta function). In other words, a scalar signal
$x(t)$ is categorized as white if $r_x(\tau)$ is small compared to $r_x(0)$ for nonzero $\tau$ in a certain
range (e.g. for values of $\tau$ smaller than a horizon $T$). Pictorially, the autocorrelation plot,
normalized to $r_x(0) = 1$, must fall inside a band around zero, of width $\gamma$, as in Figure 1.

Figure 1: Autocorrelation plot of a pseudorandom sequence

From the classical statistical point of view, the choice of $\gamma$ is associated with a level of
significance of the test, which in turn depends on some stochastic model. But regardless of
the reasoning behind this choice, ultimately the “whiteness” of the signal is decided in terms
of a set, which is parameterized by $\gamma$ (and $T$). This motivates the following:

Definition 1 A signal $x(t)$, $t = 0, \ldots, N-1$, is said to be white with accuracy $\gamma$ up to time
lag $T$ if it satisfies

$$|r_x(\tau)| \le \gamma\, r_x(0), \qquad \tau = 1, \ldots, T \qquad (9)$$

The set of all such signals is denoted $W_{N,\gamma,T}$.
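
Membership in the set of Definition 1 is directly computable from data. The following sketch, with hypothetical function names, evaluates the circular autocorrelation and tests constraint (9); it also returns the smallest accuracy level for which a given sequence qualifies as white.

```python
def correlogram_lag(x, tau):
    """Circular autocorrelation r_x(tau) of a finite sequence, as in (2)."""
    n = len(x)
    return sum(x[(t + tau) % n] * x[t] for t in range(n))


def whiteness_level(x, horizon):
    """Smallest gamma for which x satisfies (9) up to time lag `horizon`."""
    r0 = correlogram_lag(x, 0)
    if r0 == 0.0:
        return 0.0  # the zero signal satisfies (9) trivially
    return max(abs(correlogram_lag(x, tau)) for tau in range(1, horizon + 1)) / r0


def is_white(x, gamma, horizon):
    """True if |r_x(tau)| <= gamma * r_x(0) for tau = 1..horizon."""
    r0 = correlogram_lag(x, 0)
    return all(
        abs(correlogram_lag(x, tau)) <= gamma * r0
        for tau in range(1, horizon + 1)
    )


delta = [1.0] + [0.0] * 15   # unit pulse: all nonzero lags vanish
constant = [1.0] * 16        # fully correlated signal: r_x(tau) = r_x(0)

print(is_white(delta, 0.0, 8))        # white even with gamma = 0
print(is_white(constant, 0.5, 3))     # rejected for any gamma < 1
print(whiteness_level(constant, 3))
```

The two test signals mark the extremes: the unit pulse belongs to every such set, while a constant signal is admitted only when $\gamma = 1$, where the constraint is vacuous.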

It  is  natural  to  introduce  a  horizon  T  in  which  the  autocorrelations  are  required  to  be 
small;  this  reduces  the  number  of  constraints,  and  if  the  response  of  a  system  is  to  be  analyzed, 
low correlation is only relevant in time scales where the system responds strongly. Both $\gamma$
and $T$ are, in fact, a parameterization of a rectangular weight function which specifies our
constraints  on  the  autocorrelation.  Other  shapes  of  this  weight  function  could  be  considered, 
and  the  following  results  can  be  extended  with  minor  modifications. 

The response of an LTI system to signals in such sets will now be analyzed. The worst
case gain of the system under signals in $W_{N,\gamma,T}$ (a seminorm on systems) will be denoted

$$\|\mathcal{H}\|_{W_{N,\gamma,T}} = \sup\left\{ \frac{\|y\|}{\|u\|} : y = \mathcal{H}u,\ u \in W_{N,\gamma,T},\ \|u\| \ne 0 \right\} \qquad (10)$$


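Theorem 1 In the conditions of Lemma 1, if $\sum_{t=-\infty}^{\infty} |r_h(t)| < \infty$,

1. If $u \in W_{N,\gamma,T}$, then

$$\left| \frac{r_y(\tau)}{\|u\|^2} - r_h(\tau) \right| \le \gamma \sum_{\substack{t=\tau-T \\ t \ne \tau}}^{\tau+T} |r_h(t)| + \sum_{|t-\tau|>T} |r_h(t)| \qquad (11)$$

2.

$$\left| \|\mathcal{H}\|^2_{W_{N,\gamma,T}} - \|\mathcal{H}\|_2^2 \right| \le \gamma \sum_{\substack{t=-T \\ t \ne 0}}^{T} |r_h(t)| + \sum_{|t|>T} |r_h(t)| \qquad (12)$$

3. For the special case $\mathcal{H}(\lambda) = \sum_{t=0}^{T} h(t)\lambda^t$ ($\mathcal{H}$ is FIR),

$$\|\mathcal{H}\|_2^2 \le \|\mathcal{H}\|^2_{W_{N,\gamma,T}} \le \|\mathcal{H}\|_2^2 (1-\gamma) + \gamma \sum_{\tau=-T}^{T} |r_h(\tau)| \qquad (13)$$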

Proof: (11) follows immediately from Lemma 1, and the definition of $W_{N,\gamma,T}$. Specializing
to $\tau = 0$ gives (12). The upper bound in (13) follows from (12), the lower bound from the
fact that the delta function is always a signal in the set $W_{N,\gamma,T}$.

□ 
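
The bounds in (13) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the FIR coefficients are hypothetical, the input is a pseudorandom sequence, and $\gamma$ is taken as the smallest value for which that particular input lies in $W_{N,\gamma,T}$. It also evaluates the squared peak of the power spectrum on a frequency grid, which can never exceed the value of the upper bound at $\gamma = 1$.

```python
import math
import random


def filter_autocorrelation(h, tau):
    """r_h(tau) of (3) for a causal FIR impulse response h."""
    tau = abs(tau)
    return sum(h[t + tau] * h[t] for t in range(len(h) - tau))


def correlogram_lag(x, tau):
    """Circular autocorrelation r_x(tau), indices modulo N."""
    n = len(x)
    return sum(x[(t + tau) % n] * x[t] for t in range(n))


def steady_state_output(h, u):
    """Periodic steady state response of FIR filter h to periodic input u."""
    n = len(u)
    return [sum(h[i] * u[(t - i) % n] for i in range(len(h))) for t in range(n)]


h = [1.0, 0.5, 0.25]  # hypothetical FIR filter, so T = len(h) - 1
T = len(h) - 1
n = 64

random.seed(2)
u = [random.gauss(0.0, 1.0) for _ in range(n)]
r0 = correlogram_lag(u, 0)
# Smallest gamma such that this particular u belongs to W_{N,gamma,T}.
gamma = max(abs(correlogram_lag(u, tau)) for tau in range(1, T + 1)) / r0

# Squared gain ||y||^2 / ||u||^2 achieved by this input.
gain_sq = correlogram_lag(steady_state_output(h, u), 0) / r0

h2_sq = filter_autocorrelation(h, 0)  # squared 2-norm, as in (4)
abs_sum = sum(abs(filter_autocorrelation(h, tau)) for tau in range(-T, T + 1))
upper = h2_sq * (1.0 - gamma) + gamma * abs_sum  # right hand side of (13)

# The unit pulse attains the lower bound of (13).
delta = [1.0] + [0.0] * (n - 1)
delta_gain_sq = correlogram_lag(steady_state_output(h, delta), 0)

# Grid estimate of the squared peak of the power spectrum s_h.
grid = [2.0 * math.pi * i / 512 for i in range(512)]
peak_sq = max(
    h2_sq + 2.0 * sum(filter_autocorrelation(h, tau) * math.cos(w * tau)
                      for tau in range(1, T + 1))
    for w in grid
)

print(gamma, gain_sq, upper, h2_sq, peak_sq, abs_sum)
```

A single input only gives a lower estimate of the worst case gain; the point of the example is that the achieved gain stays below the upper bound of (13) evaluated at the whiteness level of that input.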

Remarks: 

1. From inequality (11) we conclude that the autocorrelations of the time series $y$ (up
to a constant factor $\|u\|^2$) lie in a band centered on the autocorrelations of the filter.
Therefore, such a band is a natural set description for colored noise, the output of a
linear filter under white noise.

2. It can be shown (see [14]) that if $\gamma <$ [illegible] then for large enough $N$ the upper bound in
(13) is achieved. This is no longer true for large values of $\gamma$; for example, if $\gamma = 1$, there
are no restrictions on the input signal, and the induced norm can be bounded by the
$\mathcal{H}_\infty$ norm of the system, which in the FIR case is equal to

$$\sup_{\omega} \sqrt{ r_h(0) + 2\sum_{\tau=1}^{T} r_h(\tau)\cos\omega\tau }$$

and is in general strictly less than the bound (13). The role of $\gamma$ in this deterministic
approach is therefore to constrain the freedom of the “adversary” (the disturbance),
and results in a worst-case gain which varies from the $\mathcal{H}_\infty$ norm for $\gamma = 1$ to the $\mathcal{H}_2$
norm for $\gamma = 0$. Ideally, one could use $\gamma = 0$ but in a finite horizon setting, this would
constitute an unrealistically small class of signals. This raises the issue of trading off
between the objectives 2 and 3 stated in Section 1: obtaining low worst case gains on
sets which accommodate a reasonable class of signals.

To analyze this we turn to asymptotic results, as the length $N$ of the data record goes to
infinity, and find rates of $\gamma$, $T$ that achieve this compromise. First, we give conditions under
which the asymptotic norm is the $\mathcal{H}_2$ norm, which follow directly from Theorem 1:

Corollary 1

1. If $T$ is fixed, $\mathcal{H}$ FIR($T$), and $\gamma(N) \to 0$ as $N \to \infty$, then $\|\mathcal{H}\|_{W_{N,\gamma,T}} \to \|\mathcal{H}\|_2$.

2. If $T(N) \to \infty$ and $\gamma(N) \to 0$ as $N \to \infty$, $\mathcal{H}$ IIR, then $\|\mathcal{H}\|_{W_{N,\gamma,T}} \to \|\mathcal{H}\|_2$.

Secondly, a natural requirement for a set description as in Definition 1 to be rich enough
is that the set have “large” probability when the signal effectively comes from a stochastic
white process: this is how these sets are chosen in the standard statistical approach. A
reasonably general answer is the following ($\mathcal{P}$ denotes probability):

Theorem 2 Let $x(0), \ldots, x(N-1), \ldots$ be independent, identically distributed random variables,
with 0 mean and finite variance. For each $N$, denote $x^N = (x(0), \ldots, x(N-1))$.

1. If $T$ is fixed, and $\gamma(N)\sqrt{N} \to \infty$ as $N \to \infty$, then $\mathcal{P}(x^N \in W_{N,\gamma,T}) \to 1$.

2. If the $x(t)$ are bounded, and [rate condition on $\gamma(N)$ illegible], then $\mathcal{P}(x^N \in W_{N,\gamma,N-1}) \to 1$.

3. If the $x(t)$ are Gaussian, and $\gamma(N)\,\cdot$ [factor illegible] $\to \infty$, then $\mathcal{P}(x^N \in W_{N,\gamma,N-1}) \to 1$.

Proof:  See  the  Appendix. 
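
Part 1 of Theorem 2 can be explored by simulation. The sketch below is a Monte Carlo estimate under assumptions that are not in the paper: Gaussian samples from Python's `random` module, a fixed horizon $T = 3$, and the rate $\gamma(N) = N^{-1/4}$, chosen so that $\gamma(N)\sqrt{N}$ grows without bound. The function names are hypothetical.

```python
import random


def correlogram_lag(x, tau):
    """Circular autocorrelation r_x(tau), indices modulo N."""
    n = len(x)
    return sum(x[(t + tau) % n] * x[t] for t in range(n))


def whiteness_level(x, horizon):
    """Smallest gamma for which x satisfies (9) up to time lag `horizon`."""
    r0 = correlogram_lag(x, 0)
    if r0 == 0.0:
        return 0.0
    return max(abs(correlogram_lag(x, tau)) for tau in range(1, horizon + 1)) / r0


def empirical_probability(n, gamma, horizon, trials, seed):
    """Fraction of i.i.d. Gaussian sequences of length n that fall in W_{n,gamma,horizon}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if whiteness_level(x, horizon) <= gamma:
            hits += 1
    return hits / trials


horizon = 3
for n in (16, 64, 256):
    gamma = n ** -0.25  # gamma(N) -> 0 while gamma(N) * sqrt(N) = N^(1/4) -> infinity
    estimate = empirical_probability(n, gamma, horizon, trials=200, seed=n)
    print(n, round(gamma, 3), estimate)
```

The estimates depend on the seed and on the number of trials, so they indicate a trend rather than verify the limit.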

For parts 2, 3, $T(N) = N-1$ was chosen (all the entries of $r_x(\tau)$ are constrained; in fact,
it is equivalent to constrain up to $T(N) = \lfloor N/2 \rfloor$). This type of result is different from those
which have typically been considered in the statistical literature, where a small number of $\tau$
values is employed, and substantial “averaging” is available. As will be discussed later, there
are also reasons in this setting to reduce the number of constraints. It seems interesting,
however, that the autocorrelation plot captures “whiteness” uniformly across all values of $\tau$;
this is a strong argument in favor of the use of this statistic. This result also supports the
notion of distance employed in the correlogram, in terms of the vector $\infty$ norm. In the next
section we will see that the “periodogram” is not as well behaved.

In any event, by showing that the asymptotic probability of $W_{N,\gamma,N-1}$ is 1, it immediately
follows that the same holds for $W_{N,\gamma,T}$, with a smaller growth rate of $T$.

A simple way to summarize the preceding results in relation to stochastic white noise is
to say that the expected gain of an LTI system to white noise (the $\mathcal{H}_2$ norm) is essentially the
same as the worst case gain of the system in a set of signals which is “typical” from the point
of view of the probability, when the mechanism which generates the disturbances is assumed
to be stochastic.

As  remarked  before,  this  assumption  cannot  be  directly  verified,  and  there  is  evidence  that 
non-stochastic  systems  (e.g.  deterministic  chaos,  see  [1]  and  references  therein)  can  produce 
similar  spectral  properties. 

Another situation where disturbances are considered is as “residuals” of some system identification
technique, i.e. an error variable needed to explain the experimental data. Though
the system identification theory assumes a stochastic model for this disturbance, in practice
it always includes other deterministic (e.g. nonlinear) effects.

The previous results show that in terms of rejection, what matters is the statistical information
(which may be directly tied to experiments), not the generating mechanism. Autocorrelation
constraints which characterize a disturbance (and may or may not be consistent
with the levels for stochastic noise) can be incorporated into a worst-case rejection measure.

3.2  Infinite  horizon  descriptions 

We conclude the section by considering the infinite horizon counterpart of Definition 1. For
brevity, we will treat $l_2$, $BV$ signals simultaneously.

Definition 2 A signal in $l_2$ (or in $BV$) is said to be white, with accuracy $\gamma$ up to time lag $T$
if it satisfies

$$|r_x(\tau)| \le \gamma\, r_x(0), \qquad \tau = 1, \ldots, T \qquad (14)$$

The set of all such signals is denoted $W_{\gamma,T}$.

For an exponentially stable $\mathcal{H}$, defining the norm $\|\mathcal{H}\|_{W_{\gamma,T}}$ analogously to (10) it can be
concluded in a similar way as before that

$$\|\mathcal{H}\|_2^2 \le \|\mathcal{H}\|^2_{W_{\gamma,T}} \le \|\mathcal{H}\|_2^2 + \gamma \sum_{\tau=-T}^{T} |r_h(\tau)| + \sum_{|t|>T} |r_h(t)| \qquad (15)$$

It is tempting to consider the set $W_{0,\infty} = \{x(t) \in BV : r_x(\tau) = 0 \ \forall \tau \ne 0\}$. In this class
(which is the one used in other deterministic spectral analysis treatments, such as [10]) the
induced norm is exactly $\|\mathcal{H}\|_{W_{0,\infty}} = \|\mathcal{H}\|_2$, and moreover, for the bounded power case the
class contains trajectories of stochastic white noise:

Theorem 3 Let $x(0), \ldots, x(t), \ldots$ be independent, identically distributed random variables,
with 0 mean and finite variance. Then $\mathcal{P}(x \in W_{0,\infty}) = 1$ ($W_{0,\infty} \subset BV$).

Proof: For a fixed $\tau \ne 0$, referring to [5] (proposition 6.31), we find that the random process
$z(t) = x(t)\,x(t+\tau)$ is ergodic, so with probability 1,

$$\lim_{N\to\infty} \frac{1}{2N+1}\sum_{t=-N}^{N} x(t+\tau)\,x(t) = E[x(t+\tau)\,x(t)] = 0 \qquad (16)$$

Therefore $W_{0,\infty}$ has probability 1 (countable intersection of probability 1 sets).

□ 

These results on $W_{0,\infty}$ are not, however, particularly useful for the following reasons. In the
first place, the constraints on a bounded power signal depend exclusively on its asymptotic
behavior: any sequence in $\mathbb{R}^N$ is a valid truncation of a white power signal. From a practical
perspective, it is impossible to know whether a disturbance is in $W_{0,\infty}(BV)$, just as it is to
verify that a signal is generated by a stochastic white process. In this respect, it seems that
the $l_2$ version is a better behaved infinite horizon abstraction; if a signal in $l_2$ is truncated with
large enough $N$ so as to capture most of its energy, then the autocorrelations are essentially
determined by the truncation.

Also, when dealing with uncertain systems, constraints on the signals must be enforced
explicitly in any analysis or synthesis procedure, as will be discussed below. The definition
of $W_{0,\infty}$ requires an infinite number of constraints, which cannot be handled. In the case
of a finite number, such as in $W_{\gamma,T}$ ($W_{0,T}$ in particular), these constraints can be naturally
introduced into a robustness analysis or synthesis problem, as is shown in [15].

4  Frequency  domain  descriptions 

4.1  Finite  horizon  descriptions 

In  the  frequency  domain,  the  natural  object  of  study  is  the  power  spectrum;  as  the  name 
implies,  a  “white”  signal  is  characterized  by  a  flat  power  spectrum.  Referring  to  finite  length 
signals,  presumably  a  set  characterization  of  whiteness  can  be  obtained  by  specifying  the 
periodogram  sx(k)  to  be  close  to  a  constant  across  k ;  it  is  difficult,  however,  to  find  a  notion 
of  distance  in  which  this  holds  for  typical  white  noise  signals.  Figure  2  shows  a  typical 
periodogram  of  a  signal  obtained  from  a  pseudorandom  number  generator.  As  we  can  see, 
the  periodogram  is  very  erratic,  and  is  not  close  to  its  average  value  in  a  pointwise  sense. 
Various  authors  (see  [6]  and  references  therein)  have  studied  the  stochastic  properties  of  the 
periodogram.  In  the  case  of  Gaussian  noise,  for  example,  the  following  can  be  shown: 

Proposition 1 Let $x(t)$, $t = 0, \ldots, N-1$, be independent, complex Gaussian random variables,
of mean zero and variance 1. Then $\frac{s_x(k)}{N}$, $k = 0, \ldots, N-1$, are independent random variables of
exponential distribution, with expected value 1, and the expected value of $\max_k \frac{s_x(k)}{N}$ grows as
$\log(N)$ when $N \to \infty$.

Figure 2: Periodogram of a pseudorandom sequence

This  means  that  if  we  attempt  to  characterize  white  noise  by  a  band  of  power  spectra 
around  a  constant  value,  this  band  would  have  to  grow  arbitrarily  to  be  able  to  capture 
stochastic noise. It is not hard to show that the worst case gain under signals in such a band
would approach the $\mathcal{H}_\infty$ norm of the system; therefore, this description would not be tight
enough  for  our  purposes.  Sets  of  spectra  defined  in  terms  of  other  simple  vector  norms  can 
also  be  shown  to  be  not  satisfactory  for  similar  reasons. 
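
The erratic behavior of the raw periodogram is easy to reproduce. The sketch below generates complex Gaussian pseudorandom sequences of unit variance (an assumption matching the hypothesis of Proposition 1, with half of the variance in each of the real and imaginary parts) and compares the peak of the normalized periodogram with its average and with $\log N$. Function names are hypothetical.

```python
import cmath
import math
import random


def periodogram(x):
    """s_x(k) = |X(k)|^2 with the unnormalized DFT of (1)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n)
    ]


def normalized_periodogram_stats(n, seed):
    """Return (peak, mean, power): the maximum and the average of s_x(k)/N,
    and the average power of the sequence itself."""
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)  # real and imaginary parts each carry half the variance
    x = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]
    s = [value / n for value in periodogram(x)]
    power = sum(abs(v) ** 2 for v in x) / n
    return max(s), sum(s) / n, power


for n in (64, 256, 512):
    peak, mean, power = normalized_periodogram_stats(n, seed=n)
    print(n, round(peak, 3), round(mean, 3), round(math.log(n), 3))
```

The average of the normalized periodogram equals the average signal power by Parseval's relation and stays near 1, whereas the peak is expected to drift upward with $N$; individual runs fluctuate, so the printed values only illustrate the tendency.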

The  fact  that  a  “raw”  periodogram  of  a  noise  signal  is  not  a  very  well  behaved  statistic 
has  long  been  recognized  in  the  statistical  spectral  analysis  community  (see  [3]).  Peaks  in 
the  periodogram  do  not  necessarily  correspond  to  underlying  periodicities  in  the  time-series, 
and  from  this  point  of  view  the  autocorrelation  plot  is  more  significant.  Another  way  to  say 
this  is  that  the  frequency  domain  is  not  a  natural  set  of  “coordinates”  to  uncover  trends  in 
noisy data. Rotating the data back to the time domain (autocorrelation plot) gives a tight
description for “whiteness”: small distance to the delta function in the vector $\infty$ norm.

There  is,  however,  a  way  around  this  difficulty  that  has  been  used  extensively  in  statistical 
spectral  analysis,  in  terms  of  smoothing  of  the  periodogram  (see  [18,  6]):  adequate  local 
averaging  in  the  periodogram  reveals  the  process  spectral  information.  In  this  section,  we 
will  pursue  the  same  smoothing  approach  to  provide  set  descriptions  of  white  noise.  Instead 
of  smoothing  by  convolution  (as  in  [14])  in  this  paper  we  will  adopt  an  approach  of  averaging 
in a set of bands: given a uniform filter bank $V_m(k)$, $m = 0, \ldots, M-1$, a white signal will be
characterized by having approximately equal energy on these bands. Various designs for the
filter bank could be considered; in this paper we will assume for simplicity that the length of
the signal is an integer multiple of the number of bands, $N = MK$, and that the filter bank is
made of ideal bandpass filters of the form

$$V_m(k) = \begin{cases} 1 & mK \le k < (m+1)K \\ 0 & \text{otherwise} \end{cases} \qquad (17)$$


Definition 3 Let $N = MK$. Define the filter bank $V = \{V_m(k),\ m = 0, \ldots, M-1\}$ as in (17).
A signal $x(t)$ of length $N$ is said to be white with accuracy $\alpha$, with respect to the filter bank
$V$, if its periodogram $s_x(k)$ verifies

$$\max_m \frac{\langle s_x, V_m \rangle}{\|x\|^2} \le 1 + \alpha \qquad (18)$$

We denote the set of all such signals as $W_{N,\alpha,V}$.
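
The frequency domain test can also be run on data. The sketch below assumes that the inner product in (18) is the plain average of the periodogram over the $K$ frequencies of band $m$, and that the signal is real valued; this normalization is an assumption made for the example, as are the function names.

```python
import cmath


def periodogram(x):
    """s_x(k) = |X(k)|^2 with the unnormalized DFT of (1)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n)
    ]


def max_band_ratio(x, num_bands):
    """Largest band average of the periodogram, normalized by ||x||^2."""
    n = len(x)
    if n % num_bands != 0:
        raise ValueError("signal length must be a multiple of the number of bands")
    band_len = n // num_bands          # K in the text
    s = periodogram(x)
    energy = sum(v * v for v in x)     # ||x||^2 = r_x(0) for a real signal
    averages = [
        sum(s[m * band_len:(m + 1) * band_len]) / band_len
        for m in range(num_bands)
    ]
    return max(averages) / energy


def is_band_white(x, num_bands, alpha):
    """True if every normalized band average is at most 1 + alpha."""
    return max_band_ratio(x, num_bands) <= 1.0 + alpha


delta = [1.0] + [0.0] * 15   # flat periodogram: every band average equals 1
constant = [1.0] * 16        # all energy at k = 0, concentrated in the first band

print(max_band_ratio(delta, 4), max_band_ratio(constant, 4))
```

With $M = 4$ bands the constant signal puts all of its energy in one band, so its ratio is close to 4, while the unit pulse gives exactly the global average of 1.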


In  the  definition,  the  inner  product  (sx,Vm)  averages  the  periodogram  in  the  band.  The 
requirement  is  that  the  (normalized)  band  averages  be  close  to  the  global  average  of  1  (since 
these are nonnegative quantities, constraining from above suffices). With these definitions,
results which parallel those in the time domain can be written. The worst-case induced norm
of a system $\mathcal{H}$ under signals in the set $W_{N,\alpha,V}$ will be denoted $\|\mathcal{H}\|_{W_{N,\alpha,V}}$.


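Theorem 4 Consider a stable LTI system $\mathcal{H}$, such that $B = \max_\omega \left|\frac{d}{d\omega}|\mathcal{H}(e^{j\omega})|^2\right|$ is finite.
Then

$$\|\mathcal{H}\|_2^2 - \frac{B\pi}{M} \le \|\mathcal{H}\|^2_{W_{N,\alpha,V}} \le \|\mathcal{H}\|_2^2 (1+\alpha) + \frac{B\pi}{M} \qquad (19)$$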


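Proof: Denote $h_m = \frac{M}{2\pi}\int_{2\pi m/M}^{2\pi(m+1)/M} |\mathcal{H}(e^{j\omega})|^2\,d\omega$. Note that $\frac{1}{M}\sum_{m=0}^{M-1} h_m = \|\mathcal{H}\|_2^2$.
If $mK \le k < (m+1)K$, then

$$\left| |H(k)|^2 - h_m \right| \le \frac{M}{2\pi}\int_{2\pi m/M}^{2\pi(m+1)/M} \left| |\mathcal{H}(e^{j\frac{2\pi}{N}k})|^2 - |\mathcal{H}(e^{j\omega})|^2 \right| d\omega \le \frac{M}{2\pi}\int_{2\pi m/M}^{2\pi(m+1)/M} B \left| \frac{2\pi k}{N} - \omega \right| d\omega \le \frac{B\pi}{M} \qquad (20)$$

Let $u(t)$ be a signal in $W_{N,\alpha,V}$, $\|u\|^2 = \frac{1}{N}\sum_{k=0}^{N-1} s_u(k) = 1$, $y$ the corresponding output.

$$\|y\|^2 = \frac{1}{N}\sum_{k=0}^{N-1} s_u(k)|H(k)|^2 = \frac{1}{N}\sum_{m=0}^{M-1}\sum_{k=0}^{N-1} s_u(k)|H(k)|^2 V_m(k)$$

$$= \frac{1}{M}\sum_{m=0}^{M-1} h_m\, \frac{1}{K}\sum_{k=0}^{N-1} s_u(k)V_m(k) + \frac{1}{N}\sum_{m=0}^{M-1}\sum_{k=0}^{N-1} s_u(k)\left(|H(k)|^2 - h_m\right)V_m(k) \qquad (21)$$

For the second term in (21), the bound (20) gives

$$\left| \frac{1}{N}\sum_{m=0}^{M-1}\sum_{k=0}^{N-1} s_u(k)\left(|H(k)|^2 - h_m\right)V_m(k) \right| \le \frac{B\pi}{M}\left( \frac{1}{N}\sum_{m=0}^{M-1}\sum_{k=0}^{N-1} s_u(k)V_m(k) \right) = \frac{B\pi}{M} \qquad (22)$$

Since $u \in W_{N,\alpha,V}$,

$$\langle s_u, V_m \rangle = \frac{1}{K}\sum_{k=0}^{N-1} s_u(k)V_m(k) \le 1 + \alpha \qquad (23)$$

Also, for signals with $s_u(k) = 1$ the left hand side of (23) achieves the value 1. Therefore the
supremum of the first term in (21) is bounded between $\|\mathcal{H}\|_2^2$ and $(1+\alpha)\|\mathcal{H}\|_2^2$. Incorporating
(22), the supremum of $\|y\|^2$ is between the bounds in (19).

□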
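
The key step (20) can be checked for a concrete filter. The sketch below assumes a hypothetical real FIR impulse response, for which $|\mathcal{H}(e^{j\omega})|^2 = r_h(0) + 2\sum_{\tau \ge 1} r_h(\tau)\cos\omega\tau$, so that the band averages $h_m$ have a closed form. Instead of the exact constant $B$ it uses the upper estimate $2\sum_\tau \tau\,|r_h(\tau)|$, which can only make the bound looser.

```python
import math


def filter_autocorrelation(h, tau):
    """r_h(tau) of (3) for a causal FIR impulse response h."""
    tau = abs(tau)
    return sum(h[t + tau] * h[t] for t in range(len(h) - tau))


def power_spectrum(r, w):
    """s_h(e^{jw}) = r_h(0) + 2 sum_{tau>=1} r_h(tau) cos(w tau), with r = [r_h(0), r_h(1), ...]."""
    return r[0] + 2.0 * sum(r[tau] * math.cos(w * tau) for tau in range(1, len(r)))


def band_average(r, m, num_bands):
    """h_m: average of s_h over [2 pi m / M, 2 pi (m+1) / M], integrated in closed form."""
    a = 2.0 * math.pi * m / num_bands
    b = 2.0 * math.pi * (m + 1) / num_bands
    integral = r[0] * (b - a) + 2.0 * sum(
        r[tau] * (math.sin(tau * b) - math.sin(tau * a)) / tau
        for tau in range(1, len(r))
    )
    return num_bands / (2.0 * math.pi) * integral


h = [1.0, 0.5, 0.25]  # hypothetical FIR impulse response
r = [filter_autocorrelation(h, tau) for tau in range(len(h))]

M, K = 8, 16
N = M * K

# Upper estimate of B, from d s_h / d w = -2 sum_tau tau r_h(tau) sin(w tau).
B = 2.0 * sum(tau * abs(r[tau]) for tau in range(1, len(r)))

h_bands = [band_average(r, m, M) for m in range(M)]

# Largest deviation of |H(k)|^2 from the average of its own band, as in (20).
worst_gap = max(
    abs(power_spectrum(r, 2.0 * math.pi * k / N) - h_bands[k // K])
    for k in range(N)
)

print(sum(h_bands) / M, r[0], worst_gap, B * math.pi / M)
```

The average of the $h_m$ reproduces $\|\mathcal{H}\|_2^2 = r_h(0)$, and the observed gap stays below $B\pi/M$; increasing the number of bands $M$ tightens both quantities.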


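Corollary 2 In the conditions of Theorem 4, if $\alpha(N) \to 0$ and $M(N) \to \infty$ as $N \to \infty$, then

$$\|\mathcal{H}\|_{W_{N,\alpha,V}} \to \|\mathcal{H}\|_2 \qquad (24)$$

Theorem 5 Let $x(0), \ldots, x(N-1), \ldots$ be independent, Gaussian, zero mean random variables.
For each $N$, denote $x^N = (x(0), \ldots, x(N-1))$. Assume $\alpha(N) \to 0$, $M(N) \to \infty$,
and [illegible condition involving $M\log(M)$] $\to \infty$ as $N \to \infty$. Then

$$\mathcal{P}(x^N \in W_{N,\alpha,V}) \to 1 \quad \text{as } N \to \infty \qquad (25)$$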


Proof:  See  the  Appendix. 

The  previous  statements  provide  the  frequency  domain  counterpart  of  the  time  domain 
asymptotic results. Provided $\alpha(N) \to 0$, $M(N) \to \infty$ with appropriate rates (e.g.
$M(N) = \sqrt{N}$, $\alpha(N) =$ [illegible]) the worst case disturbance rejection measure approaches the
$\mathcal{H}_2$-norm of the system, while the class of signals contains asymptotically all typical instances
of  stochastic  white  noise,  at  least  in  the  Gaussian  case. 

We have introduced essentially dual descriptions in time and frequency domains, of parameterized
sets which describe white noise. With the chosen definitions, however, these sets
are different: the time and frequency domain constraints do not correspond to each other
through DFT. Such a correspondence can be achieved if the Euclidean norm is used to measure
distance in both domains (the DFT is unitary), and smoothing in the frequency domain
is performed by convolution. The resulting signal constraints are more complicated, however,
since they involve fourth powers of the signals.

As  mentioned  in  Section  1,  an  important  objective  in  the  choice  for  these  descriptions  is 
their  simplicity  and  compatibility  with  representations  for  uncertainty.  From  this  point  of 
view,  constraints  which  are  quadratic  in  the  signals  involved  are  especially  adequate.  This  fact 
has  been  recognized  in  the  work  of  Yakubovich  [20]  and  Megretski  [13],  where  uncertainty  is 
described in terms of integral quadratic constraints (IQCs). Alternatively, in [15] it is shown how
these set characterizations for white noise can be easily fit into the standard linear fractional
transformation  (LFT)  framework  for  uncertainty,  expressing  these  sets  as  kernels  of  LFTs  on 
uncertain  operators.  This  is  the  underlying  motivation  for  our  particular  choices  for  the  sets. 

In  any  event,  for  large  N  and  these  choices  of  parameters,  both  time  and  frequency 
definitions  capture  sets  of  signals  which  are  “likely”  if  we  adhere  to  a  probabilistic  model, 
and  which  produce  approximately  the  same  worst-case  disturbance  rejection  measure;  from 
the point of view of the objectives of this paper, they are equally adequate descriptions
for  white  noise. 

4.2  Infinite  horizon  descriptions 

An infinite horizon (in $l_2$, or BV) counterpart of the set $W_{N,\alpha,V}$ is now given. A filter bank
$V = \{V_m(e^{j\omega}),\ m = 0, \ldots, M-1\}$ is used, where the integral across frequency of each $V_m$ is 1.

Definition 4 Given a filter bank $V = \{V_m(e^{j\omega}),\ m = 0, \ldots, M-1\}$, an $l_2$ (or BV) signal $x(t)$
is said to be white with accuracy $\alpha$, with respect to the filter bank $V$, if its spectral density
$s_x(e^{j\omega})$ verifies

$$\max_m \frac{1}{2\pi}\int_0^{2\pi} s_x(e^{j\omega})\, V_m(e^{j\omega})\, d\omega \le (1+\alpha)\,\frac{1}{2\pi}\int_0^{2\pi} s_x(e^{j\omega})\, d\omega \tag{26}$$

We denote the set of all such signals as $W_{\alpha,V}$.


For the case of a filter bank with ideal bandpass filters,

$$V_m(e^{j\omega}) = \begin{cases} M, & \frac{2\pi m}{M} \le \omega < \frac{2\pi (m+1)}{M} \\ 0, & \text{otherwise} \end{cases} \tag{27}$$


we  arrive  in  a  similar  manner  to  the  bound 


|7f||22<||H||2^<||^(l  +  a)+  M 


7 TB 


(28) 
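
As an illustration of the filter bank test with ideal bandpass filters, the following minimal sketch evaluates the whiteness condition on a finite record. It rests on assumptions that are not part of the text: the integrals over frequency are approximated by sums over the $N$ DFT bins of the record, $N$ is taken to be a multiple of $M$, and the function name `band_whiteness` is hypothetical.

```python
import numpy as np

def band_whiteness(x, M):
    """Smallest alpha for which x passes the ideal band-pass whiteness test.

    Assumption: integrals over frequency are replaced by averages over the
    N DFT bins of x, and N is a multiple of M.
    """
    x = np.asarray(x, dtype=float)
    N = x.size
    if N % M != 0:
        raise ValueError("this sketch needs N to be a multiple of M")
    s = np.abs(np.fft.fft(x)) ** 2          # sampled spectral density
    total = s.mean()                        # approximates (1/2pi) * integral of s_x
    # V_m equals M on band m and 0 elsewhere, so (1/2pi) * integral of s_x * V_m
    # is approximated by the average of s over the N/M bins of band m.
    band_avg = s.reshape(M, N // M).mean(axis=1)
    return float(band_avg.max() / total - 1.0)
```

A unit impulse has a flat spectrum and returns an accuracy of zero, while a pure tone concentrates its energy in one or two bands and returns a large value.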


5  The  multivariable  case 

This  section  outlines  how  the  previous  methodology  can  be  extended  to  deal  with  vector 
valued  white  noise  signals.  For  reasons  of  brevity  and  simplicity  we  will  only  consider  the 
case  of  infinite  horizon  l2  signals,  which  demonstrates  all  the  necessary  extensions. 

For vector-valued signals $x(t) \in l_2(\mathbb{R}^n)$, the matrix autocorrelation (prime denotes
transpose, $*$ denotes conjugate transpose) is given by

$$R_x(\tau) = \sum_{t=-\infty}^{\infty} x(t+\tau)\, x'(t) \tag{29}$$

and the matrix spectrum by $S_x(e^{j\omega}) = X(e^{j\omega}) X^*(e^{j\omega})$, where $X(e^{j\omega})$ is the Fourier transform
of $x(t)$. They are related as in (8). The 2-norm of the signal verifies

$$\|x\|_2^2 = \mathrm{trace}(R_x(0)) = \frac{1}{2\pi}\int_0^{2\pi} \mathrm{trace}\left(S_x(e^{j\omega})\right) d\omega \tag{30}$$
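
The energy identity relating the 2-norm, the zero-lag autocorrelation and the spectrum can be checked numerically on a finite-support signal. The sketch below is only an illustration: the helper names are hypothetical, the signal is stored as an array of shape `(L, n)`, and the frequency integral is replaced by an average over a zero-padded DFT grid, which is exact for finite-support signals by Parseval's relation.

```python
import numpy as np

def matrix_autocorrelation(x, tau):
    """R_x(tau) = sum_t x(t + tau) x(t)' for a finite-support signal of shape (L, n)."""
    L, n = x.shape
    if abs(tau) >= L:
        return np.zeros((n, n))
    if tau >= 0:
        return x[tau:].T @ x[:L - tau]
    return x[:L + tau].T @ x[-tau:]

def energy_three_ways(x, grid=None):
    """Return ||x||^2 computed in time, from R_x(0), and from the sampled spectrum."""
    L, n = x.shape
    K = grid or 2 * L                       # DFT grid size, at least L
    X = np.fft.fft(x, n=K, axis=0)          # samples of X(e^{jw}), one column per channel
    time_energy = float(np.sum(x ** 2))
    corr_energy = float(np.trace(matrix_autocorrelation(x, 0)))
    # trace(S_x) = sum_i |X_i|^2; the grid average stands in for (1/2pi) * integral
    freq_energy = float(np.mean(np.sum(np.abs(X) ** 2, axis=1)))
    return time_energy, corr_energy, freq_energy
```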

Consider a stable, discrete time, linear time invariant system with, in general, $n$ inputs
and $p$ outputs, $H(\lambda) = \sum_{t=-\infty}^{\infty} H(t)\lambda^t$, with frequency response $H(e^{j\omega})$. We define
$R_H(\tau) = \sum_{t=-\infty}^{\infty} H(t+\tau) H'(t)$ and $S_H(e^{j\omega}) = H(e^{j\omega}) H^*(e^{j\omega})$. The 2-norm of $H$ satisfies the relation
(30). If $u(t) \in l_2(\mathbb{R}^n)$ and $y(t) \in l_2(\mathbb{R}^p)$ are, respectively, the input and output to $H$, then the
following relations hold:

$$R_y(\tau) = \sum_{t=-\infty}^{\infty}\sum_{s=-\infty}^{\infty} H(s)\, R_u(\tau + t - s)\, H'(t) \tag{31}$$

$$S_y(e^{j\omega}) = H(e^{j\omega})\, S_u(e^{j\omega})\, H^*(e^{j\omega}) \tag{32}$$
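
The time domain input-output relation for the autocorrelations can be verified numerically for a finite impulse response system. The following sketch is an illustration under assumptions of our own: the taps are stored as an array `H` of shape `(Lh, p, n)`, the input as an array of shape `(Lu, n)`, and all function names are hypothetical.

```python
import numpy as np

def autocorr(sig, tau):
    """R(tau) = sum_t sig(t + tau) sig(t)' for a finite-support signal of shape (L, n)."""
    L, n = sig.shape
    if abs(tau) >= L:
        return np.zeros((n, n))
    if tau >= 0:
        return sig[tau:].T @ sig[:L - tau]
    return sig[:L + tau].T @ sig[-tau:]

def filter_output(H, u):
    """y(t) = sum_s H(s) u(t - s) for FIR taps H (Lh, p, n) and input u (Lu, n)."""
    Lh, p, n = H.shape
    Lu = u.shape[0]
    y = np.zeros((Lh + Lu - 1, p))
    for s in range(Lh):
        y[s:s + Lu] += u @ H[s].T           # tap s contributes H(s) u(k) to y(s + k)
    return y

def output_autocorr_from_input(H, u, tau):
    """Double sum over taps: sum_t sum_s H(s) R_u(tau + t - s) H(t)'."""
    Lh, p, _ = H.shape
    total = np.zeros((p, p))
    for t in range(Lh):
        for s in range(Lh):
            total += H[s] @ autocorr(u, tau + t - s) @ H[t].T
    return total
```

Comparing `autocorr(filter_output(H, u), tau)` with `output_autocorr_from_input(H, u, tau)` for a few lags gives agreement up to rounding error.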

Now we give set descriptions of vector valued white noise. In the time domain, $R_x(\tau)$
should be small for $\tau \ne 0$, and $R_x(0)$ must be approximately a constant times the
identity matrix. This implies that, in addition to the components of $x(t)$ being scalar white
noise signals, they must be “spatially” uncorrelated.

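Some matrix norms will be used in the following:

$$\|A\|_\infty = \max_{i,j} |a_{ij}|, \qquad \|A\|_1 = \sum_{i,j} |a_{ij}|, \qquad \|A\|_F = \left(\mathrm{trace}(A'A)\right)^{1/2}$$

Note that

$$|\mathrm{trace}(AB)| \le \|A\|_\infty \|B\|_1, \qquad |\mathrm{trace}(AB)| \le \|A\|_F \|B\|_F \tag{33}$$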
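
The two trace inequalities can be checked numerically. The sketch below assumes an entrywise reading of the norms (largest absolute entry, sum of absolute entries, and Frobenius norm); the function names are hypothetical and serve only this illustration.

```python
import numpy as np

def norm_max(A):
    """Largest absolute entry of A."""
    return float(np.max(np.abs(A)))

def norm_sum(A):
    """Sum of the absolute values of the entries of A."""
    return float(np.sum(np.abs(A)))

def norm_fro(A):
    """Frobenius norm, (trace(A'A))^(1/2)."""
    return float(np.sqrt(np.trace(A.T @ A)))

def trace_bounds(A, B):
    """Return |trace(AB)| together with the two upper bounds."""
    t = abs(float(np.trace(A @ B)))
    return t, norm_max(A) * norm_sum(B), norm_fro(A) * norm_fro(B)
```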


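Definition 5 A signal $x(t) \in l_2(\mathbb{R}^n)$ is said to be white with accuracy $\gamma$ up to time lag $T$ if
it satisfies

$$\left\| \frac{R_x(\tau)}{\|x\|_2^2} - \frac{1}{n}\delta(\tau) I \right\| \le \gamma, \qquad |\tau| \le T \tag{34}$$

We will denote the set of all such signals by $W^n_{\gamma,T}$.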
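
A finite record can be tested against this time domain definition directly. The sketch below makes several assumptions that should be read as ours: the norm is the largest absolute entry, the autocorrelation is normalized by the signal energy, the constraint is imposed for all lags up to $T$ in absolute value, and the function names are hypothetical.

```python
import numpy as np

def autocorr(sig, tau):
    """R(tau) = sum_t sig(t + tau) sig(t)' for a finite-support signal of shape (L, n)."""
    L, n = sig.shape
    if abs(tau) >= L:
        return np.zeros((n, n))
    if tau >= 0:
        return sig[tau:].T @ sig[:L - tau]
    return sig[:L + tau].T @ sig[-tau:]

def whiteness_gamma(x, T):
    """Smallest gamma for which x (shape (L, n)) passes the lag-T whiteness test."""
    L, n = x.shape
    energy = float(np.sum(x ** 2))
    worst = 0.0
    # R_x(-tau) is the transpose of R_x(tau), so non-negative lags are enough
    for tau in range(T + 1):
        R = autocorr(x, tau) / energy
        if tau == 0:
            R = R - np.eye(n) / n           # compare the zero lag with (1/n) I
        worst = max(worst, float(np.max(np.abs(R))))
    return worst
```

For a long record of independent Gaussian samples the returned value is small, consistent with such records being typical members of the set for modest accuracy levels.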


In the definition, $\delta(\tau)$ is the usual delta function, and the norm $\|\cdot\|$ referred to in the
definition can be in principle any matrix norm. The norm $\|\cdot\|_\infty$ has the advantage of giving
quadratic constraints on the signal. Defining $\|H\|_{W^n_{\gamma,T}}$ as in (10), we have:
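Theorem 6 Defining $W^n_{\gamma,T}$ in terms of the matrix norm $\|\cdot\|_\infty$,

$$\left| \|H\|^2_{W^n_{\gamma,T}} - \frac{1}{n}\|H\|_2^2 \right| \le \gamma \sum_{\tau=-T}^{T} \|R_{H'}(\tau)\|_1 + \sum_{|\tau|>T} \|R_{H'}(\tau)\|_F \tag{35}$$

Proof: Let $\|u\|_2 = 1$, $u \in W^n_{\gamma,T}$. Starting from (31),

$$\mathrm{trace}(R_y(0)) = \sum_{t=-\infty}^{\infty}\sum_{s=-\infty}^{\infty} \mathrm{trace}\left(R_u(t-s)\, H'(t) H(s)\right) = \sum_{\tau=-\infty}^{\infty} \mathrm{trace}\left(R_u(\tau)\, R_{H'}(\tau)\right)$$

where $R_{H'}(\tau) = \sum_s H'(s+\tau) H(s)$. Therefore

$$\mathrm{trace}(R_y(0)) - \frac{1}{n}\|H\|_2^2 = \sum_{\tau=-\infty}^{\infty} \mathrm{trace}\left(\left(R_u(\tau) - \frac{1}{n}\delta(\tau) I\right) R_{H'}(\tau)\right)$$

Using (33), and the fact that $\|R_u(\tau)\|_F \le \|u\|_2^2$ for $\tau \ne 0$, we have

$$\left| \mathrm{trace}(R_y(0)) - \frac{1}{n}\|H\|_2^2 \right| \le \sum_{\tau=-T}^{T} \left\| R_u(\tau) - \frac{1}{n}\delta(\tau) I \right\|_\infty \|R_{H'}(\tau)\|_1 + \sum_{|\tau|>T} \|R_u(\tau)\|_F\, \|R_{H'}(\tau)\|_F$$

$$\le \gamma \sum_{\tau=-T}^{T} \|R_{H'}(\tau)\|_1 + \sum_{|\tau|>T} \|R_{H'}(\tau)\|_F$$

□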


Under similar growth conditions as in Corollary 1, the asymptotic norm is $\frac{1}{\sqrt{n}}\|H\|_2$.
For a frequency domain characterization, consider


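Definition 6 Given a filter bank $V = \{V_m(e^{j\omega}),\ m = 0, \ldots, M-1\}$, an $l_2(\mathbb{R}^n)$ signal $x(t)$
is said to be white with accuracy $\alpha$, with respect to the filter bank $V$, if its spectral density
$S_x(e^{j\omega})$ verifies

$$\max_m \left\| \frac{1}{2\pi \|x\|_2^2} \int_0^{2\pi} S_x(e^{j\omega})\, V_m(e^{j\omega})\, d\omega - \frac{1}{n} I \right\| \le \alpha \tag{36}$$

The set of all such signals is denoted $W^n_{\alpha,V}$.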


We will not pursue the subsequent analysis, which follows the same lines as before.




6  Conclusion 


In  this  paper,  set  characterizations  of  white  noise  in  terms  of  constraints  in  signal  space  were 
presented.  It  was  shown  how  these  sets  can  be  “tailored”  to  adequately  capture  stochastic 
noise,  retaining  its  properties  in  terms  of  system  gain,  now  understood  in  a  worst  case  setting. 
The  parameterization  allows,  however,  a  greater  flexibility  in  signal  characterization,  and  the 
finite  horizon  version  allows  these  descriptions  to  be  tied  directly  to  experimental  data. 

The  bounds  obtained  for  worst-case  gain  on  these  sets  of  signals  are  useful  in  showing 
that  this  procedure  is  sound  and  consistent  with  the  alternative  stochastic  approach,  but 
they are not exact, and are too complicated to provide a basis for robust performance analysis
when  the  system  H  is  subject  to  uncertainty.  The  major  argument  given  in  Section  1  in  favor 
of  adopting  these  deterministic  descriptions  was,  after  all,  to  unify  white  noise  rejection  with 
robustness  analysis.  Fortunately  there  is  an  elegant  framework  (developed  fully  in  [15])  which 
encompasses  our  deterministic  descriptions  of  white  noise  with  other  forms  of  uncertainty  in 
the system. This framework relates to the IQC [20, 13] formulation, and to recent
developments in uncertain behavioral systems [7]. The resulting methods for “Robust $\mathcal{H}_2$” analysis
and synthesis have been pursued in [15, 8].

This  framework  can  be  extended,  to  some  degree,  to  continuous  time  systems.  While 
“pure”  white  noise  is  a  difficult  object  to  define,  it  is  clear  that  useful  approximations  can  be 
obtained, for example, from sets of signals defined in terms of frequency domain
characterizations, in a similar manner to the discrete time case.

The finite horizon version of this framework can also be applied to the area of worst-case
system identification. Recent work [9, 16] has shown that if noise disturbances are allowed to
be arbitrary norm bounded signals, the identification problem has high computational
complexity. It is to be expected that constraining the disturbance in the style of this paper (constraining
the freedom of the “adversary” in the identification problem) will bring some reduction in
complexity.


Appendix 

The stochastic results will be proved here. In the sequel, $x(0), \ldots, x(N-1), \ldots$ are independent,
identically distributed random variables, of 0 mean and finite variance. Since the sets $W_{N,\gamma,T}$
and $W_{N,\alpha,V}$ are closed under scalar multiplication, the variance can be normalized to 1.


Proof  of  Theorem  2: 

Part 1: For the case of a fixed time lag $\tau$, the distribution of the autocorrelation $r_x(\tau)$
has been extensively studied in the statistical literature [3, 2, 12]; exact expressions for the
distribution of $r_x(\tau)/r_x(0)$ when $x(t)$ is Gaussian are obtained in [2], and asymptotic normality
holds. We outline a proof for completeness.

A central limit theorem on the $\tau$-dependent (see [4]) random variables $z(n) = x(n)x(n+\tau)$
shows that $\frac{1}{\sqrt{N}}\sum_{n=0}^{N-\tau-1} x(n)x(n+\tau)$ is asymptotically normal $\mathcal{N}(0,1)$. Since $\tau$ is fixed, the
“circular” terms are vanishingly small, and the same holds for $\frac{1}{\sqrt{N}} r_x(\tau)$. Also $\frac{1}{N} r_x(0)$ converges
almost surely to 1, so $\sqrt{N}\, r_x(\tau)/r_x(0)$ is asymptotically normal. Since $\gamma\sqrt{N} \to \infty$, and $T$ is fixed,

$$\mathcal{P}\left(x \notin W_{N,\gamma,T}\right) \le \sum_{\tau=1}^{T} \mathcal{P}\left(\sqrt{N}\,\frac{|r_x(\tau)|}{r_x(0)} > \gamma\sqrt{N}\right) \xrightarrow{N \to \infty} 0 \tag{37}$$

□


In parts 2, 3 of the theorem, the number of correlation constraints grows with the sample size,
and the argument with the normal approximation cannot be used: even though each $r_x(\tau)$
for fixed $\tau$ is asymptotically normal, the joint distribution of $(r_x(1), \ldots, r_x(N-1))$ is defined
on a space of increasing dimension, where no global “averaging” occurs. Our proof relies on
a Hoeffding inequality for sums of bounded random variables [11]:

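Theorem 7 (Hoeffding) Let $z_0, \ldots, z_{N-1}$ be independent random variables, of mean $\mu$ and
bounded ($a \le z_n \le b$), $\bar{z} = \frac{1}{N}\sum_{n=0}^{N-1} z_n$. Then for $\epsilon > 0$,

$$\mathcal{P}\left(\bar{z} - \mu \ge \epsilon\right) \le e^{-\frac{2N\epsilon^2}{(b-a)^2}} \tag{38}$$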
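
As a quick sanity check of the Hoeffding inequality, the following sketch compares the bound with an empirical tail frequency. The choice of uniform variables on $[-1, 1]$, the trial count and the function names are assumptions made only for this illustration.

```python
import numpy as np

def hoeffding_bound(N, eps, a, b):
    """Right-hand side of the inequality: exp(-2 N eps^2 / (b - a)^2)."""
    return float(np.exp(-2.0 * N * eps ** 2 / (b - a) ** 2))

def empirical_tail(N, eps, trials=20000, seed=0):
    """Monte Carlo estimate of P(sample mean >= eps) for uniform variables on [-1, 1]."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(-1.0, 1.0, size=(trials, N))   # mean 0, bounded in [-1, 1]
    return float(np.mean(z.mean(axis=1) >= eps))
```

The bound is typically far from tight for a given distribution; what matters in the proof is its exponential decay in $N$, uniformly over all bounded distributions.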

To apply this inequality to the sum $r_x(\tau) = \sum_{t=0}^{N-1} z(t)$, with $z(t) = x(t)\,x((t+\tau) \bmod N)$, the
following lemma takes care of the slight dependence between the terms, by dividing the
sum in three sums, ensuring $z(t)$, $z((t+\tau) \bmod N)$, $z((t-\tau) \bmod N)$ fall in different groups for
each $t$. We omit the elementary proof, which requires a discussion on the values of $N$ and $\tau$.

Lemma 2 Let $N \ge 3$, and $x(0), x(1), \ldots, x(N-1)$ be independent identically distributed
random variables. For fixed $1 \le \tau \le \frac{N}{2}$, there exists a partition (depending only on $N$, $\tau$) of the
terms in the sum $r_x(\tau)$ into three groups, giving $r_x(\tau) = S_0 + S_1 + S_2$, where each $S_i$ is the
sum of $n_i$ independent random variables, and each $n_i$ is at least a fixed fraction of $N$.
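
One way to build such a partition is sketched below. This is an illustrative construction and not necessarily the one intended by the lemma: the indices are split into the cycles of the map $t \mapsto (t+\tau) \bmod N$, and each cycle is coloured with three colours so that consecutive indices differ. The function name is hypothetical, and the lag $\tau = N/2$ (where $z(t)$ and $z((t+\tau) \bmod N)$ coincide) is excluded by assumption.

```python
from math import gcd

def three_group_partition(N, tau):
    """Split 0..N-1 into three groups so that t and (t + tau) % N never share a group."""
    if N < 3 or not (1 <= tau < N) or 2 * tau == N:
        raise ValueError("need N >= 3, 1 <= tau < N and tau != N/2 in this sketch")
    g = gcd(N, tau)
    cycle_len = N // g                      # length of each cycle t -> t + tau (mod N)
    colour = [0] * N
    for start in range(g):                  # one cycle per residue class modulo g
        t = start
        for i in range(cycle_len):
            c = i % 3
            if i == cycle_len - 1 and c == 0:
                c = 1                       # avoid a clash with the first index of the cycle
            colour[t] = c
            t = (t + tau) % N
    return [[t for t in range(N) if colour[t] == c] for c in range(3)]
```

Within a group no two terms $z(t)$ share a sample of $x$, so the terms of each group are functions of disjoint sets of independent variables.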

Part 2: Assume $x(0), \ldots, x(N-1)$ are bounded random variables, $|x(t)| \le K$. Pick
$1 \le \tau \le \frac{N}{2}$. From the lemma $r_x(\tau) = S_0 + S_1 + S_2$, where each $S_i$ is the sum of $n_i$ independent,
identically distributed random variables, with zero mean and bounded in $[-K^2, K^2]$. Invoking
Hoeffding’s inequality and the lower bound on $n_i$ from the lemma, we have


V 


r*(T) 

N 


i= 0 


> e )  <  Y, v  ( ~  > e ^  D e  ^  ^ 3e 


2 

£ 

2—0 


(39) 


The same argument can be employed to bound $\mathcal{P}\left(-\frac{r_x(\tau)}{N} > \epsilon\right)$, for each value of $\tau$. This
implies

V  (  max  >  e)  <  3Ne ^  =  3eL^W(i- 


(40) 


y<r<f  IV 

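By symmetry of $r_x(\tau)$, $W_{N,\gamma,N-1} = W_{N,\gamma,N/2}$. Now choose $0 < \rho < 1$. The complement of
$W_{N,\gamma,N-1}$ can be written as

$$W^c_{N,\gamma,N-1} = \left\{ \max_{1 \le \tau \le N/2} \frac{|r_x(\tau)|}{r_x(0)} > \gamma \right\} \subset \left\{ \max_{1 \le \tau \le N/2} \frac{|r_x(\tau)|}{N} > \gamma\rho \right\} \cup \left\{ \frac{r_x(0)}{N} < \rho \right\} \tag{41}$$

The probability of the first set is bounded by (40), setting $\epsilon = \gamma\rho$. The probability of the
second set can be bounded by another use of the Hoeffding inequality, applied to the bounded
IID random variables $x(t)^2$. Putting everything together,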


V 


K^-0  ^  *™h*&*) 


+  e  k4 


(42) 


The second term clearly goes to zero as $N \to \infty$, and the same happens with the first term,
by the hypothesis on the growth of $\gamma$.

Part 3: Assume $x(0), \ldots, x(N-1)$ are Gaussian random variables, $x(t) \sim \mathcal{N}(0,1)$.
Choosing $K(N) = \sqrt{2\,\mathrm{Log}(N)}$, define the random variables $v(t)$, $t = 0, \ldots, N-1$ by truncation:

$$v(t) = \begin{cases} x(t), & \text{if } |x(t)| \le K(N) \\ 0, & \text{otherwise} \end{cases} \tag{43}$$

$$\mathcal{P}(x \ne v) \le N\,\mathcal{P}\left(x(t) \ne v(t)\right) = N\,\mathcal{P}\left(|x(t)| > K(N)\right) \le \frac{C}{\sqrt{2\,\mathrm{Log}(N)}} \tag{44}$$

In (44) $x = (x(0), \ldots, x(N-1))$, $v = (v(0), \ldots, v(N-1))$, and the second inequality follows from
a standard bound to the tail of the normal distribution ($C$ is a constant). Observing that

$$\mathcal{P}\left(x \notin W_{N,\gamma,N-1}\right) \le \mathcal{P}\left(v \notin W_{N,\gamma,N-1}\right) + \mathcal{P}(x \ne v) \tag{45}$$

it remains to show that $\mathcal{P}\left(v \notin W_{N,\gamma,N-1}\right)$ also vanishes as $N \to \infty$. Since the variables $v(t)$ are
bounded by $K(N)$, (42) gives


V  (v  $  Wn,j,n-i)  <  3e 


ljSK.^L,og{Isi) 


+  e 


(46) 


The second term clearly has limit 0 as $N \to \infty$. The first term also goes to 0, since by hypothesis

lM£r)  =  ‘tlSnt  goes  to  infinitV- 

□ 
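
The truncation step in Part 3 can be illustrated numerically. The sketch below assumes the truncation level $K(N) = \sqrt{2\,\mathrm{Log}(N)}$ and uses the standard Gaussian tail bound $\mathcal{P}(x > K) \le \phi(K)/K$, where $\phi$ is the standard normal density; the resulting constant $\sqrt{2/\pi}$ and the function name are ours and are not taken from the text.

```python
import math

def truncation_failure_bound(N):
    """Return N * P(|x| > K(N)) for x ~ N(0, 1), and a Mills-ratio upper bound on it."""
    K = math.sqrt(2.0 * math.log(N))
    exact = N * math.erfc(K / math.sqrt(2.0))      # N * P(|x| > K)
    # N * 2 * phi(K) / K, simplified using exp(-K^2 / 2) = 1 / N
    upper = math.sqrt(2.0 / math.pi) / K
    return exact, upper
```

Both quantities decay like $1/\sqrt{\mathrm{Log}(N)}$, which is slow but sufficient for the probability that the truncated and original sequences differ to vanish.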


Proof  of  Theorem  5: 

For simplicity we consider the complex Gaussian case. From Proposition 1, the normalized spectral samples $s_x(k)$, $0 \le k \le N-1$,
are independent, exponential random variables with mean 1 (for the real case, the
same holds for $0 < k < \frac{N}{2}$, but $s_x(0)$, $s_x(\frac{N}{2})$ have a different distribution, and the spectrum
is symmetric around $\frac{N}{2}$; the proof can be extended to this case). If $\Lambda_K$ is the sum of $K$
independent, exponential, mean 1 variables, the bound

$$\mathcal{P}\left(\Lambda_K > u\right) \le \left(\frac{K+p}{u}\right)^p \tag{47}$$

holds for any positive integer $p$. Let $\epsilon = \frac{\alpha}{3}$, $K = \frac{N}{M}$. We can write


W&a.v  C 


1  I.  ||  2 

N  I|X|1  < 


(1  -  €)}  U  U  >(l  +  a)(l  -e)} 


(48) 


Focusing  on  each  set  in  the  second  union,  K{&,Vm)  is  the  sum  of  K  exponential,  mean  1 


random  variables.  Choose  p  =  [^J.  Using  (47), 


V  ((j  {(^,Vrm)>(l  +  a)(l-€)}j  <M( 


K  +  P 


K(  l  +  a)(l-€) 


LogM 


=  e 


^  LrjaM  L°9 


(  p+">o.-o.) 


(49) 


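As $N \to \infty$, $\alpha \to 0$, and $\frac{p}{K} \sim \epsilon = \frac{\alpha}{3}$. This gives

$$\frac{p}{\mathrm{Log}\,M}\,\mathrm{Log}\left(\frac{(1+\alpha)(1-\epsilon)}{1+\frac{p}{K}}\right) \sim \frac{p}{\mathrm{Log}\,M}\,\frac{\alpha}{3} \sim \frac{K\alpha^2}{9\,\mathrm{Log}\,M} = \frac{N\alpha^2}{9M\,\mathrm{Log}\,M} \xrightarrow{N \to \infty} \infty \tag{50}$$

Therefore the probability in (49) goes to zero as $N \to \infty$. As for the first set in (48),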


N- 1 


\ 


'P  (^ll^ll2  <  =V  Z)(®(*)2  “  X)  < 


N—*oo 


(51) 


since 


VNe  > 


VkM  N-+co 


oo ,  and  using  asymptotic  normality. 


□ 
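
The moment bound on sums of exponential variables used in this proof, read here as $\mathcal{P}(\Lambda_K > u) \le ((K+p)/u)^p$, can be compared with the exact tail of the Erlang distribution. It follows from Markov's inequality applied to the $p$-th moment, since $E[\Lambda_K^p] = K(K+1)\cdots(K+p-1) \le (K+p)^p$. The function names below are hypothetical.

```python
import math

def erlang_tail(K, u):
    """Exact P(Lambda_K > u) for a sum of K independent mean-1 exponentials."""
    term, total = 1.0, 1.0
    for i in range(1, K):
        term *= u / i                       # term equals u^i / i!
        total += term
    return math.exp(-u) * total

def moment_bound(K, p, u):
    """Markov-type bound ((K + p) / u)^p on the same tail probability."""
    return ((K + p) / u) ** p
```

For fixed $K$ and $u$ the bound can be optimized over $p$, which is what the choice of $p$ proportional to $K$ in the proof accomplishes asymptotically.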